High-precision entity and relation extraction in medical domain based on pseudo-entity data augmentation
Andi GUO, Zhen JIA, Tianrui LI
Journal of Computer Applications    2024, 44 (2): 393-402.   DOI: 10.11772/j.issn.1001-9081.2023020143

To address the density of knowledge in medical text and the propagation of errors between entity extraction and relation classification, a high-precision entity and relation extraction framework based on pseudo-entity data augmentation was proposed. First, a Transformer-based feature reading unit was added to the entity extraction module to capture category information, so that long medical entities could be identified accurately among densely packed entities. Second, a relation negative example generation module was inserted into the pipeline extraction framework: an under-sampling-based pseudo-entity generation model produced pseudo-entities intended to confuse the relation classification model, and three data augmentation strategies were proposed to improve the model's ability to identify subject-object reversal, subject-object boundary errors, and relation classification errors. Finally, the sharp increase in training time caused by data augmentation was alleviated by a levitated-marker-based relation classification model. On the CMeIE dataset, the proposed model was compared with four mainstream models. For entity extraction, the proposed model improved the F1 value by 2.26% compared with the second-best model, PL-Marker (Packed Levitated Marker). For entity relation extraction, the proposed model improved the F1 value by 5.45% and the precision by 15.62% compared with the second-best model, the pipeline extraction model proposed by CBLUE (Chinese Biomedical Language Understanding Evaluation). The experimental results show that using the feature reading unit together with the pseudo-entity data augmentation module effectively improves extraction precision.
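The abstract names three kinds of relation negatives but gives no implementation detail, so the following is only a minimal sketch of how such negatives might be built from gold triples. Everything in it is an assumption made for illustration: the `Span` and `Example` types, the `NO_RELATION` label, the one-token boundary shift, and the reading of "relation classification errors" as subject-object pairs that have no gold relation. It does not reproduce the paper's under-sampling-based pseudo-entity generation model.

```python
from typing import List, NamedTuple

NO_RELATION = "NO_RELATION"  # hypothetical label for negative examples


class Span(NamedTuple):
    start: int  # inclusive token index
    end: int    # exclusive token index


class Example(NamedTuple):
    subject: Span
    obj: Span
    label: str


def reversal_negatives(gold: List[Example]) -> List[Example]:
    """Swap subject and object of each gold triple (subject-object reversal)."""
    gold_pairs = {(g.subject, g.obj) for g in gold}
    return [
        Example(g.obj, g.subject, NO_RELATION)
        for g in gold
        if (g.obj, g.subject) not in gold_pairs
    ]


def boundary_variants(span: Span, n_tokens: int) -> List[Span]:
    """Shift one boundary by one token, keeping the span non-empty and in range."""
    variants = []
    for d_start, d_end in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        start, end = span.start + d_start, span.end + d_end
        if 0 <= start < end <= n_tokens:
            variants.append(Span(start, end))
    return variants


def boundary_negatives(gold: List[Example], n_tokens: int) -> List[Example]:
    """Pair a gold entity with a pseudo-entity whose boundary is slightly wrong."""
    gold_pairs = {(g.subject, g.obj) for g in gold}
    out = []
    for g in gold:
        for s in boundary_variants(g.subject, n_tokens):
            if (s, g.obj) not in gold_pairs:
                out.append(Example(s, g.obj, NO_RELATION))
        for o in boundary_variants(g.obj, n_tokens):
            if (g.subject, o) not in gold_pairs:
                out.append(Example(g.subject, o, NO_RELATION))
    return out


def mismatched_pair_negatives(gold: List[Example]) -> List[Example]:
    """Pair a subject with an object from another triple (assumed to be unrelated)."""
    gold_pairs = {(g.subject, g.obj) for g in gold}
    seen, out = set(), []
    for a in gold:
        for b in gold:
            pair = (a.subject, b.obj)
            if a.subject != b.obj and pair not in gold_pairs and pair not in seen:
                seen.add(pair)
                out.append(Example(a.subject, b.obj, NO_RELATION))
    return out
```

In a pipeline setting, negatives like these would be mixed with the gold triples when training the relation classifier, which is also why the training set grows quickly and a faster classifier becomes attractive.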
